Characteristic sets profile features: Estimation and application to SPARQL query planning

Abstract

RDF dataset profiling is the task of extracting a formal representation of a dataset’s features. Such features may cover various aspects of the RDF dataset, ranging from information on licensing and provenance to statistical descriptors of the data distribution and its semantics. In this work, we focus on the characteristic sets profile features, which capture both structural and semantic information of an RDF dataset, making them a valuable resource for different downstream applications. While previous research demonstrated the benefits of characteristic sets in centralized and federated query processing, access to these fine-grained statistics is taken for granted. However, especially in federated query processing, computing this profile feature is challenging, as it can be difficult and/or costly to access and process the entire data from all federation members. We address this shortcoming by introducing the concept of a profile feature estimation and propose a sampling-based approach to generate estimations for the characteristic sets profile feature. In addition, we showcase the applicability of these feature estimations in federated querying by proposing a query planning approach that is specifically designed to leverage these feature estimations. In our first experimental study, we intrinsically evaluate the representativeness of the feature estimation. The results show that even small samples of just 0.5 % of the original graph’s entities allow for estimating both structural and statistical properties of the characteristic sets profile features. Our second experimental study extrinsically evaluates the feature estimations by investigating their applicability in our query planner using the well-known FedBench benchmark. The results of the experiments show that the estimated profile features allow for obtaining efficient query plans.
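To make the idea of a sampled characteristic sets profile concrete, here is a minimal Python sketch. It is not the authors' estimation method: it assumes that a characteristic set is the set of predicates attached to a subject, that entities are sampled uniformly at random, and that counts are extrapolated by simple linear scaling. The function names and the toy triples are hypothetical.

```python
import random
from collections import Counter, defaultdict


def characteristic_sets(triples):
    """Count how many subjects share each characteristic set.

    A characteristic set is taken here to be the frozenset of predicates
    that occur with a given subject.
    """
    predicates_by_subject = defaultdict(set)
    for s, p, _o in triples:
        predicates_by_subject[s].add(p)
    return Counter(frozenset(ps) for ps in predicates_by_subject.values())


def estimate_characteristic_sets(triples, fraction, seed=0):
    """Estimate the counts from a uniform random sample of subjects.

    Assumption (not from the paper): counts observed in the sample are
    scaled linearly by (number of subjects / sample size).
    """
    subjects = sorted({s for s, _p, _o in triples})
    k = max(1, round(len(subjects) * fraction))
    sampled = set(random.Random(seed).sample(subjects, k))
    sample_counts = characteristic_sets(t for t in triples if t[0] in sampled)
    scale = len(subjects) / k
    return {cs: n * scale for cs, n in sample_counts.items()}


# Hypothetical toy graph: (subject, predicate, object) tuples.
triples = [
    ("alice", "name", "Alice"), ("alice", "knows", "bob"),
    ("bob", "name", "Bob"), ("bob", "knows", "alice"),
    ("carol", "name", "Carol"),
    ("paper1", "title", "T"), ("paper1", "author", "alice"),
]

exact = characteristic_sets(triples)
estimate = estimate_characteristic_sets(triples, fraction=0.5)
```

Comparing `exact` with `estimate` for decreasing values of `fraction` gives a rough feel for how quickly rare characteristic sets drop out of a sample, which is the kind of representativeness question the abstract's first study addresses.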

Similar resources

SPARQL Query Optimization Using Selectivity Estimation

This poster describes three static SPARQL optimization approaches for in-memory RDF graphs: (1) a selectivity estimation index (SEI) for single query triple patterns; (2) a query pattern index (QPI) for joined triple patterns; and (3) a hybrid optimization approach that combines both indexes. Using the Lehigh University Benchmark (LUBM), we show that the hybrid approach outperforms other SPARQL...

Resource Planning for SPARQL Query Execution on Data Sharing Platforms

To increase performance, data sharing platforms often make use of clusters of nodes where certain tasks can be executed in parallel. Resource planning and especially deciding how many processors should be chosen to exploit parallel processing is complex in such a setup as increasing the number of processors does not always improve runtime due to communication overhead. Instead, there is usually...

SPARQL-DL: SPARQL Query for OWL-DL

There are many query languages (QLs) that can be used to query RDF and OWL ontologies but neither type is satisfactory for querying OWL-DL ontologies. RDF-based QLs (RDQL, SeRQL, SPARQL) are harder to give a semantics w.r.t. OWL-DL and are more powerful than what OWL-DL reasoners can provide. DL-based QLs (DIG ask queries, nRQL) have clear semantics but are not powerful enough in the general ca...

Incremental SPARQL Query Processing

The number of linked data sources available on the Web is growing at a rapid rate. Moreover, users are showing an interest in any framework that allows them to obtain answers to a formulated query by accessing heterogeneous data sources, without the need to explicitly specify the sources that answer the query. Our proposal focuses on that interest, and its goal is to build a system capable of ans...

Predicting SPARQL Query Performance

We address the problem of predicting SPARQL query performance. We use machine learning techniques to learn SPARQL query performance from previously executed queries. We show how to model SPARQL queries as feature vectors, and use k-nearest neighbors regression and Support Vector Machine with the nu-SVR kernel to accurately (R value of 0.98526) predict SPARQL query execution time. 1 Query Perfo...

Journal

Journal title: Semantic Web

Year: 2023

ISSN: 2210-4968, 1570-0844

DOI: https://doi.org/10.3233/sw-222903